Finding significant combinations of features in the presence of categorical covariates

Authors

  • Laetitia Papaxanthos
  • Felipe Llinares-López
  • Dean A. Bodenham
  • Karsten M. Borgwardt
Abstract

In high-dimensional settings, where the number of features p is much larger than the number of samples n, methods that systematically examine arbitrary combinations of features have only recently begun to be explored. However, none of the current methods is able to assess the association between feature combinations and a target variable while conditioning on a categorical covariate. As a result, many false discoveries might occur due to unaccounted confounding effects. We propose the Fast Automatic Conditional Search (FACS) algorithm, a significant discriminative itemset mining method which conditions on categorical covariates and only scales as O(k log k), where k is the number of states of the categorical covariate. Based on the Cochran-Mantel-Haenszel Test, FACS demonstrates superior speed and statistical power on simulated and real-world datasets compared to the state of the art, opening the door to numerous applications in biomedicine.
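The abstract names the Cochran-Mantel-Haenszel (CMH) test as the statistical basis of FACS. As a rough illustration of that test only (not of the FACS search algorithm or its O(k log k) pruning), the sketch below computes the CMH statistic for a single candidate feature combination across the k strata of a categorical covariate. The function name `cmh_test`, the `(a, b, c, d)` table layout, and the omission of a continuity correction are assumptions made for this example.

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test for a list of 2x2 tables.

    Each table is a tuple (a, b, c, d), one per covariate state:
        a = pattern present, positive class
        b = pattern present, negative class
        c = pattern absent,  positive class
        d = pattern absent,  negative class
    Returns (statistic, p_value). No continuity correction is applied
    (an assumption of this sketch).
    """
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            # A stratum with fewer than two samples carries no information.
            continue
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        # Hypergeometric mean and variance of cell a under the null.
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    stat = (observed - expected) ** 2 / variance
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Two strata in which the pattern is independent of the class label
# within each stratum, although the pooled table (9, 6, 6, 9) is not balanced.
stat, p = cmh_test([(8, 2, 4, 1), (1, 4, 2, 8)])
print(stat, p)
```

In this toy input the association visible in the pooled counts disappears once the test conditions on the covariate, which is the kind of confounding-driven false discovery the abstract describes.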


Similar resources

Enveloping Spectral Surfaces: Covariate Dependent Spectral Analysis of Categorical Time Series.

Motivated by problems in Sleep Medicine and Circadian Biology, we present a method for the analysis of cross-sectional categorical time series collected from multiple subjects where the effect of static continuous-valued covariates is of interest. Toward this goal, we extend the spectral envelope methodology for the frequency domain analysis of a single categorical process to cross-sectional ca...


Maximum Likelihood Estimation of Parameters in Generalized Functional Linear Model

Sometimes, in practice, data are a function of another variable, which is called functional data. If the scalar response variable is categorical or discrete, and the covariates are functional, then a generalized functional linear model is used to analyze this type of data. In this paper, a truncated generalized functional linear model is studied and a maximum likelihood approach is used to esti...


Searching for significant patterns in stratified data

Significant pattern mining, the problem of finding itemsets that are significantly enriched in one class of objects, is statistically challenging, as the large space of candidate patterns leads to an enormous multiple testing problem. Recently, the concept of testability was proposed as one approach to correct for multiple testing in pattern mining while retaining statistical power. Still, thes...


Effects of combinations of curcumin, linalool, rutin, safranal, and thymoquinone on glucose/serum deprivation-induced cell death

Objective: Several phytochemical agents are known to exhibit neuroprotective effects. Among them, curcumin, linalool, rutin, safranal, and thymoquinone have been widely investigated, and the neuroprotective activity of each has been shown in several studies. This work was planned to investigate whether different combinations of them could induce a better neuroprotective effect against glucose/seru...


Influence of Personal Features on the Change of Individual's Decision about Presence or Absence in the Labor Force (A Gender Analysis on the Basis of Panel Data of Iran)

One of the factors that influence an individual's decision to be present in or absent from the labor force is their personal features. In order to adopt appropriate policies to create employment and remove obstacles in the country's labor market, we need to know the factors mentioned above and determine the amount and direction in which each factor influences the probability of individual's presenc...




Publication date: 2016